Add histogram conversion test cases and test data #374

dricross · 2025-10-17T19:56:14Z

Description

This PR adds test cases for the classic histogram mapping algorithm. They include both raw datasets (simple values) for testing accuracy and synthetic histograms for testing correctness.

I put this in a new Go package under pkg/aws as we want to use the mapping algorithm in both opentelemetry-collector-contrib and amazon-cloudwatch-agent.

Testing

New unit tests

Unit tests for receiver/elasticsearchreceiver are failing on the GitHub runner (they pass locally):

=== FAIL: . TestIntegration/7.16.3 (83.92s)
    scraperint.go:113: 
        	Error Trace:	/home/runner/work/opentelemetry-collector-contrib/opentelemetry-collector-contrib/internal/coreinternal/scraperinttest/scraperint.go:113
        	Error:      	Condition never satisfied
        	Test:       	TestIntegration/7.16.3
    scraperint.go:95: number of resources doesn't match expected: 4, actual: 0

These changes should have no impact on that receiver, as this PR introduces an entirely new, unreferenced module (outside of version.yaml).

The same test is failing for aws-cwa-dev as well: https://github.com/amazon-contributing/opentelemetry-collector-contrib/actions/runs/18651721544

pkg/aws/tools/generator.go

pkg/aws/cloudwatch/histograms/test_cases_test.go

pkg/aws/cloudwatch/histograms/histograms.go

pkg/aws/cloudwatch/histograms/test_cases.go

movence · 2025-10-21T19:20:32Z

pkg/aws/cloudwatch/histograms/histograms.go

+	}
+
+	if dp.HasMax() {
+		if math.IsNaN(dp.Max()) {


consider making this a helper function to minimize duplication

I'll make a helper

movence · 2025-10-21T19:26:09Z

pkg/aws/cloudwatch/histograms/histograms_test.go

+	"go.opentelemetry.io/collector/pdata/pmetric"
+)
+
+func TestCheckValidity(t *testing.T) {


could add test cases for min/max

Good catch, will add.

pkg/aws/tools/generator.go

movence · 2025-10-21T19:46:21Z

pkg/aws/tools/generator.go

+	var datasets []GeneratedDataset
+	for _, config := range configs {
+		// each dataset gets a copy of the same rand so that they don't interfere with each other
+		rng := rand.New(rand.NewPCG(seed, seed))


How much randomness do we need and how significant is it? What happens if each config shares the same rand?

What we want is to be able to generate data that reasonably resembles some data distribution. We want that generated data to be reliably repeatable so that we can write integration tests around it. We could hard-code all of the values, but we decided to use a fixed seed on a random number generator instead. If the configs share the same rand, they will interfere with each other, making the generated data not reliably repeatable.

Oh yeah, I'm not questioning why we use a seed, but why each config would require a new rand. They are different distributions already, so having the same rand is fine?
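
A small sketch of the trade-off being discussed, using uniform samples and made-up dataset sizes in place of the real configs (`generate`, `perConfig`, and `shared` are hypothetical names). With a shared generator the output is still repeatable as long as the config list never changes, but changing how many values one config draws shifts every dataset after it. With one generator per config, each dataset depends only on the seed and its own config.

```go
package main

import (
	"fmt"
	"math/rand/v2"
)

// generate draws n uniform samples from rng.
func generate(rng *rand.Rand, n int) []float64 {
	out := make([]float64, n)
	for i := range out {
		out[i] = rng.Float64()
	}
	return out
}

// perConfig creates a fresh, identically seeded generator for every dataset.
func perConfig(seed uint64, sizes []int) [][]float64 {
	var datasets [][]float64
	for _, n := range sizes {
		rng := rand.New(rand.NewPCG(seed, seed))
		datasets = append(datasets, generate(rng, n))
	}
	return datasets
}

// shared uses one generator for all datasets, so each dataset starts
// wherever the previous one stopped in the random stream.
func shared(seed uint64, sizes []int) [][]float64 {
	rng := rand.New(rand.NewPCG(seed, seed))
	var datasets [][]float64
	for _, n := range sizes {
		datasets = append(datasets, generate(rng, n))
	}
	return datasets
}

func main() {
	const seed = 1

	// Grow the first dataset from 3 to 4 samples and compare the second one.
	a := perConfig(seed, []int{3, 2})
	b := perConfig(seed, []int{4, 2})
	fmt.Println("per-config, second dataset unchanged:", a[1][0] == b[1][0])

	c := shared(seed, []int{3, 2})
	d := shared(seed, []int{4, 2})
	// In the shared case the second dataset of c starts at the 4th draw,
	// which is now the last element of the first dataset of d.
	fmt.Println("shared, second dataset shifted:", c[1][0] == d[0][3])
}
```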

movence · 2025-10-21T19:47:36Z

pkg/aws/tools/generator.go

+}
+
+func gammaRandom(shape float64, rng *rand.Rand) float64 {
+	if shape < 1 {


is this written from scratch? some comments or a source will be helpful.

It is. I'll add some comments

movence · 2025-10-21T19:50:57Z

pkg/aws/cloudwatch/histograms/test_cases_test.go

+
+			assertOptionalFloat(t, "min", tc.Expected.Min, tc.Input.Min)
+			assertOptionalFloat(t, "max", tc.Expected.Max, tc.Input.Max)
+			assert.InDelta(t, tc.Expected.Average, tc.Input.Sum/float64(tc.Input.Count), 0.01)


is the delta calc conservative or more relaxed? just asking since we have seen some flakiness with tight delta calcs before

It's more relaxed than a direct floating-point comparison, but it's more conservative than a % diff check (e.g. (expected-actual)/actual).
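
To make the comparison concrete, here is a sketch of the two kinds of check using only the standard library. `withinDelta` and `withinRelative` are hypothetical stand-ins written for illustration, and the 0.01 thresholds are just example values. Which check is stricter depends on the magnitude of the values: a fixed absolute delta is tighter than a 1% relative check for large averages and looser for very small ones.

```go
package main

import (
	"fmt"
	"math"
)

// withinDelta passes when the absolute difference is at most delta.
func withinDelta(expected, actual, delta float64) bool {
	return math.Abs(expected-actual) <= delta
}

// withinRelative passes when |expected-actual|/|actual| is at most epsilon.
func withinRelative(expected, actual, epsilon float64) bool {
	if actual == 0 {
		// Avoid dividing by zero; only an exact match passes.
		return expected == 0
	}
	return math.Abs(expected-actual)/math.Abs(actual) <= epsilon
}

func main() {
	// Large magnitude: the absolute check is the stricter one.
	fmt.Println(withinDelta(1000.0, 1000.5, 0.01))    // false
	fmt.Println(withinRelative(1000.0, 1000.5, 0.01)) // true

	// Small magnitude: the relative check is the stricter one.
	fmt.Println(withinDelta(0.001, 0.002, 0.01))    // true
	fmt.Println(withinRelative(0.001, 0.002, 0.01)) // false
}
```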

movence · 2025-10-21T19:56:24Z

pkg/aws/cloudwatch/histograms/test_cases_test.go

+	}
+
+	// Rest of checks only apply if we have boundaries/counts
+	if len(hi.Boundaries) > 0 || len(hi.Counts) > 0 {


maybe create local vars with the lens of these since they are used quite a few times in this block

agarakan · 2025-10-24T17:27:13Z

pkg/aws/cloudwatch/datatypes.go

+
+package cloudwatch // import "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/aws/cloudwatch"
+
+// HistogramDataPoint is a single data point that describes the values of a Histogram.


Nit: Block quotes are more readable

agarakan · 2025-10-24T17:27:35Z

pkg/aws/cloudwatch/histograms/histograms.go

+// Copyright The OpenTelemetry Authors
+// SPDX-License-Identifier: Apache-2.0
+
+package histograms // import "github.com/open-telemetry/opentelemetry-collector-contrib/pkg/aws/cloudwatch/histograms"


Ultra small Nit: excessive comments?

dricross added 3 commits October 17, 2025 15:41

Add histogram conversion test cases and test data

fac3493

make goporto

0c54bd0

multimod-verify

054efd0

dricross marked this pull request as ready for review October 20, 2025 12:38

Merge branch 'aws-cwa-dev' into classichistograms

6397f24